
Conversation

@zhuqi-lucas
Collaborator

  1. Merge upstream DF 49.0.2 and resolve conflicts.
  2. Merge the df-48-stream branch and resolve conflicts.

mbutrovich and others added 30 commits August 6, 2025 16:32
This is too noisy and not helpful yet; we don't have a fully implemented alternative.

(cherry picked from commit 0183244)

Co-authored-by: Adrian Garcia Badaracco <[email protected]>
…ec of the correct size (apache#16995) (apache#17068)

* apache#16994 Ensure CooperativeExec#maintains_input_order returns a Vec of the correct size

* apache#16994 Extend default ExecutionPlan invariant checks

Add checks that verify the length of the vectors returned by methods that need to return a value per child.

(cherry picked from commit 2968331)
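The invariant check described above can be sketched with a standalone toy (simplified types, not DataFusion's real `ExecutionPlan` trait): any method that returns one value per child must return a `Vec` whose length equals the number of children.

```rust
// Standalone sketch (toy types, not DataFusion's actual API) of the
// per-child length invariant added to the default invariant checks.
struct Plan {
    children: Vec<Plan>,
}

impl Plan {
    // Toy stand-in for ExecutionPlan::maintains_input_order:
    // one flag per child.
    fn maintains_input_order(&self) -> Vec<bool> {
        vec![false; self.children.len()]
    }

    // The added invariant: reject a mismatched length.
    fn check_invariants(&self) -> Result<(), String> {
        let flags = self.maintains_input_order();
        if flags.len() != self.children.len() {
            return Err(format!(
                "maintains_input_order returned {} entries for {} children",
                flags.len(),
                self.children.len()
            ));
        }
        Ok(())
    }
}

fn main() {
    let plan = Plan {
        children: vec![Plan { children: vec![] }, Plan { children: vec![] }],
    };
    assert!(plan.check_invariants().is_ok());
}
```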
* Add ExecutionPlan::reset_state



* Update datafusion/sqllogictest/test_files/cte.slt

* Add reference

* fmt

* add to upgrade guide

* add explain plan, implement in more plans

* fmt

* only explain

---------

Co-authored-by: Robert Ream <[email protected]>
* Preserve equivalence properties during projection pushdown (apache#17129)

* Adds parquet data diffs

---------

Co-authored-by: Matthew Kim <[email protected]>
…17123) (apache#17174)

* Pass the input schema to stats_projection for ProjectionExpr

* Adds a test

* fmt

* clippy

---------

Co-authored-by: Haresh Khanna <[email protected]>
* fix: string_agg not respecting ORDER BY

* Fix equality of parametrizable ArrayAgg function (apache#17065)

The `ArrayAgg` struct is stateful, therefore it must implement
`AggregateUDFImpl::equals` and `hash_value` functions.

* Implement AggregateUDFImpl::equals and AggregateUDFImpl::hash_value for ArrayAgg

* Implement alternative fix

* Remove 'use std::any::Any'

* Add sqllogictest for string_agg plan

* Revert as_any to their original implementations

---------

Co-authored-by: Piotr Findeisen <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
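The ArrayAgg fix above rests on a general point: a stateful UDF cannot be compared by name alone. A standalone toy sketch (hypothetical `ArrayAggLike` type, not the real `AggregateUDFImpl` trait) of why state-aware `equals`/`hash_value` matter:

```rust
// Toy sketch: comparing only by function name would wrongly treat two
// differently-configured instances as interchangeable, which is what the
// state-aware equals/hash_value implementations prevent.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug)]
struct ArrayAggLike {
    name: &'static str,
    distinct: bool, // per-instance state
}

impl ArrayAggLike {
    // Name-only comparison (the buggy behavior): state is ignored.
    fn equals_by_name(&self, other: &Self) -> bool {
        self.name == other.name
    }

    // State-aware comparison (the fix).
    fn equals(&self, other: &Self) -> bool {
        self.name == other.name && self.distinct == other.distinct
    }

    // Hash must cover the same state that equals compares.
    fn hash_value(&self) -> u64 {
        let mut h = DefaultHasher::new();
        self.name.hash(&mut h);
        self.distinct.hash(&mut h);
        h.finish()
    }
}

fn main() {
    let plain = ArrayAggLike { name: "array_agg", distinct: false };
    let distinct = ArrayAggLike { name: "array_agg", distinct: true };
    assert!(plain.equals_by_name(&distinct)); // wrongly "equal"
    assert!(!plain.equals(&distinct)); // correctly distinguished
    // Different state feeds the hasher, so the hashes differ in practice.
    let _ = (plain.hash_value(), distinct.hash_value());
}
```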
* Update to version 49.0.1

* Add changelog for 49.0.1

* Fix sqllogictests

* update configs

* Update with PR

* prettier

* Fix slt race condition

* Tweak release notes
…17274)

* fix: align `array_has` null buffer for scalar (apache#17272)

* fix: align `array_has` null buffer for scalar

* merge
* Update version to 49.0.2

* Add changelog

* update configuration docs
* Simplify predicates in filter

* add slt test

* Use BtreeMap to make tests stable

* process edge corner cases

* add doc for simplify_predicates.rs

* add as_literal to make code neat

* reorganize file

* reduce clone call
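The predicate-simplification commits above can be illustrated with a standalone toy (not the actual `simplify_predicates.rs` code): among comparable predicates on the same column, the strictest one wins, e.g. `x > 5 AND x > 3` simplifies to `x > 5`. A `BTreeMap` keeps the output order deterministic, matching the "make tests stable" commit.

```rust
// Toy sketch: collapse `col > a AND col > b` into `col > max(a, b)`.
use std::collections::BTreeMap;

/// A predicate of the form `column > value`.
#[derive(Debug, PartialEq)]
struct GtPred {
    column: String,
    value: i64,
}

/// Keep only the strictest lower bound per column; BTreeMap gives a
/// stable, sorted output order.
fn simplify(preds: Vec<GtPred>) -> Vec<GtPred> {
    let mut best: BTreeMap<String, i64> = BTreeMap::new();
    for GtPred { column, value } in preds {
        let e = best.entry(column).or_insert(value);
        if value > *e {
            *e = value;
        }
    }
    best.into_iter()
        .map(|(column, value)| GtPred { column, value })
        .collect()
}

fn main() {
    let simplified = simplify(vec![
        GtPred { column: "x".into(), value: 5 },
        GtPred { column: "x".into(), value: 3 },
    ]);
    assert_eq!(simplified, vec![GtPred { column: "x".into(), value: 5 }]);
}
```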
…BY clause (apache#16257)

* Add order by clause to limit query for consistent results

* test: update explain plan
// We can make sure that `plan` has an ordering because
// `SortPreservingMergeExec` requires ordering to be constructed.
// If there is no ordering, `SortPreservingMergeExec::new` will panic
let ordering = plan.output_ordering().cloned().unwrap();
@zhuqi-lucas (Collaborator, Author) Sep 3, 2025
Here we use unwrap(); cc @xudong963 @matthewmturner. I am not sure it's 100% safe, so it needs another look. Before DF 49, we had a LexOrdering default to apply:

if fetch.is_some() {
    let ordering = plan
        .output_ordering()
        .cloned()
        .unwrap_or_else(LexOrdering::default);
    let plan = Arc::new(
        SortPreservingMergeExec::new(ordering, plan).with_fetch(fetch.take()),
    );
    optimized_distribution_ctx =
        DistributionContext::new(plan, data, vec![optimized_distribution_ctx]);
}

But after DF 49, LexOrdering::default no longer exists. So I believe we can use unwrap() directly, because the ordering is guaranteed to be present when we construct the plan.
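The behavior change can be sketched with plain `Option` handling (standalone toy types, not DataFusion's API): before, a missing ordering silently fell back to an empty default; now the code relies on the constructor-established invariant and unwraps.

```rust
// Standalone sketch (toy LexOrdering, not DataFusion's type) of the two
// patterns discussed above for reading a plan's output ordering.

#[derive(Clone, Debug, PartialEq, Default)]
struct LexOrdering(Vec<String>); // toy stand-in for a list of sort keys

fn output_ordering(plan_ordering: &Option<LexOrdering>) -> Option<&LexOrdering> {
    plan_ordering.as_ref()
}

fn main() {
    // Pre-DF-49 style: a missing ordering falls back to an empty default.
    let no_ordering: Option<LexOrdering> = None;
    let old_style = output_ordering(&no_ordering)
        .cloned()
        .unwrap_or_else(LexOrdering::default);
    assert_eq!(old_style, LexOrdering::default());

    // Post-DF-49 style: the plan was built via SortPreservingMergeExec,
    // which requires an ordering, so unwrap() is relied on to be safe.
    let with_ordering = Some(LexOrdering(vec!["a ASC".to_string()]));
    let new_style = output_ordering(&with_ordering).cloned().unwrap();
    assert_eq!(new_style.0, vec!["a ASC".to_string()]);
}
```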

Collaborator

Can we use the SPM above here? Just give it a fetch.

Collaborator Author

Good point, will address it.

04)------AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))]
05)--------ProjectionExec: expr=[]
06)----------CoalesceBatchesExec: target_batch_size=8192
<<<<<<< HEAD
Collaborator Author
Here we resolve the conflicts that this branch had not handled before.

SortPreservingMergeExec::new(ordering.clone(), Arc::clone(&input.plan))
.with_fetch(fetch.take()),
SortPreservingMergeExec::new(req.clone(), Arc::clone(&input.plan))
.with_fetch(*fetch),
Collaborator

Why remove the take here?

Collaborator Author

Good catch, that was an accidental change; fetch.take() should have better performance. Addressed in the latest PR.
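For context on the `take` question, a standalone sketch using plain std `Option` (no DataFusion types): `Option::take` moves the value out and leaves `None` behind, so the fetch is consumed in place, while `*fetch` through a reference copies the value and leaves the original untouched.

```rust
fn main() {
    // `take()` moves the value out and leaves `None` behind:
    let mut fetch: Option<usize> = Some(100);
    let taken = fetch.take();
    assert_eq!(taken, Some(100));
    assert_eq!(fetch, None); // consumed

    // `*fetch` through a reference copies the value; the original stays:
    let fetch2: Option<usize> = Some(100);
    let r = &fetch2;
    let copied = *r;
    assert_eq!(copied, Some(100));
    assert_eq!(fetch2, Some(100)); // still present
}
```

For a `Copy` type like `Option<usize>` both forms are cheap; `take()` additionally marks the fetch as consumed, which is the intended state here.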


@zhuqi-lucas (Collaborator, Author)

Thank you @xudong963 for the review; addressed the comments in the latest PR.

@xudong963 (Collaborator) left a comment

LGTM, let's go

@xudong963 xudong963 merged commit be0276d into branch-49 Sep 4, 2025
55 checks passed
